Fast transcription of unstructured audio recordings

Authors

  • Brandon Roy
  • Deb Roy
Abstract

We introduce a new method for human-machine collaborative speech transcription that is significantly faster than existing transcription methods. In this approach, automatic audio processing algorithms robustly detect speech in audio recordings and split it into short, easy-to-transcribe segments. Sequences of speech segments are loaded into a transcription interface that lets a human transcriber simply listen and type, with no need to find and segment speech manually or to control audio playback explicitly. As a result, playback stays synchronized to the transcriber’s speed of transcription. In evaluations using naturalistic audio recordings made in everyday home situations, the new method is up to 6 times faster than other popular transcription tools while preserving transcription quality.
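The abstract does not say which speech detector or segmentation rule the authors use. As a rough illustration of the general idea only (detect speech, then cut it into short segments), the sketch below uses a simple frame-energy threshold with a cap on segment length. The function names, the threshold value, and the 20 ms frame size are all assumptions made for this example, not details from the paper.

```python
from typing import List, Sequence, Tuple


def frame_energies(samples: Sequence[float], frame_len: int) -> List[float]:
    """Return the RMS energy of each non-overlapping frame."""
    energies = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        energies.append((sum(x * x for x in frame) / frame_len) ** 0.5)
    return energies


def segment_speech(
    samples: Sequence[float],
    sample_rate: int,
    frame_ms: int = 20,          # assumed frame size
    threshold: float = 0.1,      # assumed RMS threshold for "speech"
    max_segment_s: float = 5.0,  # assumed cap that keeps segments short
) -> List[Tuple[float, float]]:
    """Return (start, end) times in seconds of detected speech segments."""
    frame_len = int(sample_rate * frame_ms / 1000)
    max_frames = max(1, int(max_segment_s * 1000 / frame_ms))
    energies = frame_energies(samples, frame_len)

    def to_seconds(frame_index: int) -> float:
        return frame_index * frame_len / sample_rate

    segments = []
    start = None  # index of the first frame of the open segment, if any
    for i, energy in enumerate(energies):
        active = energy >= threshold
        if active and start is None:
            start = i
        # Close the open segment at silence or when it reaches the length cap.
        if start is not None and (not active or i - start >= max_frames):
            segments.append((to_seconds(start), to_seconds(i)))
            start = i if active else None
    if start is not None:
        segments.append((to_seconds(start), to_seconds(len(energies))))
    return segments
```

Each returned pair could then be queued for a transcriber in order. A detector robust enough for home recordings would need far more than an energy threshold, so this should be read as a toy stand-in.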


Related resources

Does the recording medium influence phonetic transcription of cleft palate speech?

BACKGROUND In recent years, analyses of cleft palate speech based on phonetic transcriptions have become common. However, the results vary considerably among different studies. It cannot be excluded that differences in assessment methodology, including the recording medium, influence the results. AIMS To compare phonetic transcriptions from audio and audio/video recordings of cleft palate spe...


Ecological Acoustics Perspective for Content-Based Retrieval of Environmental Sounds

In this paper we present a method to search for environmental sounds in large unstructured databases of user-submitted audio, using a general sound events taxonomy from ecological acoustics. We discuss the use of Support Vector Machines to classify sound recordings according to the taxonomy and describe two use cases for the obtained classification models: a content-based web search interface fo...


Teachers’ Strategies Used to Foster Teacher-Student and Student-Student Interactions in EFL Conversation Classrooms: A Conversation Analysis Approach

Although a wide range of strategies is used to foster interactions in EFL conversation classrooms, many novice teachers are not aware of them. In view of this problem, the current study aimed to identify such strategies commonly used by EFL teachers in conversation classrooms. To this end, fifty sessions of college level conversation classrooms were observed and their teacher...


Automated quantisation and transcription of Ornaments from audio recordings

We propose a new method for rhythm quantisation and measurement of expressive timing. This paper focuses on the automatic quantisation and rhythmic transcription of syncopated rhythms and baroque ornaments, e.g. appoggiaturas, mordents and trills, from time-tagged audio recordings without knowing the score in advance. We demonstrate the transcription of the Aria of J. S. Bach’s Goldberg Variation...


Evaluation of Two Mobile Nutrition Tracking Applications for Chronically Ill Populations with Low Literacy Skills

In this chapter, we discuss two case studies that compared and contrasted the use of barcode scanning, voice recording, and patient self reporting as a means to monitor the nutritional intake of a chronically ill population. In the first study, we found that participants preferred unstructured voice recordings rather than barcode scanning. Since unstructured voice recordings require costly tran...




Publication date: 2009